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Abstract. 

We discuss various statistical distributions of earthquake numbers. Previously we de- 
rived several discrete distributions to describe earthquake numbers for the branching model 
of earthquake occurrence: these distributions are the Poisson, geometric, logarithmic, and 
the negative binomial (NBD). The theoretical model is the 'birth and immigration' popula- 
tion process. The first three distributions above can be considered special cases of the NBD. 
In particular, a point branching process along the magnitude (or log seismic moment) axis 
with independent events (immigrants) explains the magnitude/moment-frequency relation 
and the NBD of earthquake counts in large time/space windows, as well as the dependence 
of the NBD parameters on the magnitude threshold (magnitude of an earthquake catalog 
completeness). We discuss applying these distributions, especially the NBD, to approximate 
event numbers in earthquake catalogs. There are many different representations of the NBD. 
Most can be traced either to the Pascal distribution or to the mixture of the Poisson distri- 
bution with the gamma law. We discuss advantages and drawbacks of both representations 
for statistical analysis of earthquake catalogs. We also consider applying the NBD to earth- 
quake forecasts and describe the limits of the application for the given equations. In contrast 
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to the one-parameter Poisson distribution so widely used to describe earthquake occurrence, 
the NBD has two parameters. The second parameter can be used to characterize clustering 
or over-dispersion of a process. We determine the parameter values and their uncertainties 
for several local and global catalogs, and their subdivisions in various time intervals, magni- 
tude thresholds, spatial windows, and tectonic categories. The theoretical model of how the 
clustering parameter depends on the corner (maximum) magnitude can be used to predict 
future earthquake number distribution in regions where very large earthquakes have not yet 
occurred. 
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1 Introduction 

Earthquake forecasts are important in estimating hazard and risk and in making informed 
decisions to manage emergency response. How can we establish standards for reporting and 
testing earthquake forecasts? One significant effort began in California, where the Regional 
Earthquake Likelihood Models (RELM) project published a dozen models for earthquake rate 
density and a likelihood based method for prospective testing (Field, 2007; Schorlemmer and 
Gerstenberger, 2007; Schorlemmer et al., 2007; Schorlemmer et al., 2009). The CoUaboratory 
for Study of Earthquake Predictability (CSEP) is currently extending the tests to several 
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natural laboratories around the globe. 

One standard test is to compare the number of predicted earthquakes with the actual 
number of events during the test period. To do so we need to know the statistical distribution 
of earthquake numbers. 

The standard assumption long used in earthquake hazard analysis (Cornell, 1968) is that 
earthquake occurrence is reasonably well described by the Poisson distribution. However, it 
has also been known for a long time that earthquakes are clustered in time and space: their 
distribution is over-dispersed compared to the Poisson law. One conventional way to treat 
this problem is to decluster an earthquake catalog (Schorlemmer et al., 2007). But there are 
several declustering procedures, mostly based on ad-hoc rules. Hence declustered catalogs 
are not unique and usually not fully reproducible. Therefore, it is important to derive and 
investigate earthquake number distribution in real earthquake catalogs. 

Kagan (1996), Jackson and Kagan (1999), and Kagan and Jackson (2000) have all used 
the negative binomial distribution (NBD) to approximate earthquake numbers in catalogs. 
The NBD has a higher variance than the Poisson law and can be shown (Kagan, 1973a,b) 
to result from the branching nature of earthquake occurrence. 

In principle, several other over-dispersed discrete distributions, such as generalized Pois- 
son distributions (Consul, 1989) or generalized negative-binomial distributions (Tripathi, 
2006; Hilbe, 2007) can be used to approximate earthquake numbers. However, the NBD 
has the advantage of relative simplicity and is supported by theoretical arguments (Kagan, 
1973a,b). As we discuss below, in addition to negative-binomial and Poisson distributions, 
several other statistical discrete distributions can describe earthquake numbers. A general 
discussion of such distributions can be found in Johnson et al. (2005) and Kotz et al. (2006). 

Over the years many papers have analyzed various aspects of earthquake numbers distri- 
butions; for example, see Vere- Jones (1970), Shlien and Toksoz (1970), Dionysiou and Pa- 
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padopoulos (1992). These publications investigated the distributions empirically by counting 
earthquake numbers in catalogs and trying to approximate them by various statistical laws. 
Here we explain these distributions as a consequence of the stochastic branching model. 

Therefore, in addition to the NBD and the Poisson distributions, in this study we will 
investigate the geometric and logarithmic distributions in several earthquake catalogs and 
show their applicability in certain conditions. After presenting the theoretical derivation 
of these distributions, we explore the statistical parameter estimation for these laws. Then 
we apply these methods to several earthquake catalogs. Two global (CMT and PDE) and 
one local Californian catalog are studied and the results are displayed in tables and dia- 
grams. These results can be used in earthquake forecast testing (Schorlemmer et ai, 2007; 
Schorlemmer et ai, 2009; Kagan et ai, 2009). 

2 Theoretical considerations 
2.1 Generating function for the NBD 

Several stochastic models of earthquake occurrence were proposed and almost all were based 
on the theory of branching processes (Kagan, 2006). Branching is expected to model the 
well-known property of primary and secondary clustering for aftershock sequences: a strong 
aftershock (or foreshock) tends to have its own sequence of dependent events. These multi- 
dimensional models are 

• (a) The supercritical point process branching along the magnitude axis, introduced by 
Kagan (1973a,b) and shown in Fig. [1^. Earthquake occurrence constitutes a down-going 
cascade in this model. 

• (b) Critical (or rather slightly subcritical) point process branching along the time axis 
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(Hawkes 1971; Kagan and Knopoff, 1987; Ogata, 1988) - often called the Hawkes self- 
exciting process or the ETAS model (see Fig. [T]d). Hawkes and Tukey debate (see discussion 
section in Kagan, 1973b) the difference between branching in earthquake size and that in 
time. Bremaud and Massoulie (2001) recently proposed a variant of Hawkes' process with 
no independent events (immigrants). However, in earthquake catalogs limited in time-span, 
we need to introduce independent events. Otherwise, models would be highly non-unique. 

Both models shown in Fig. [1] use the Poisson cluster process to approximate earthquake 
occurrence. Earthquake clusters are assumed to follow the Poisson occurrence. Earthquakes 
within a cluster are modeled by a multidimensional branching process, which reproduces a 
temporal-spatial pattern of dependent events (mostly aftershocks) around the initial one in 
a sequence (Kagan, 1973a,b; Kagan and Knopoff, 1987; Ogata, 1988, 1998). 

In one form or another, these models employ the classical statistical properties of earth- 
quake occurrence: the Gutenberg-Richter (G-R) relation and Omori's law. Model (a) repro- 
duces the G-R relation as the result of a supercritical branching along the magnitude axis, 
while the temporal distribution (the Omori-type law) must be imposed. In model (b) both 
the temporal and magnitude distributions are imposed. 

The simplest way to describe discrete distributions is by using the probability generat- 
ing function (Bartlett, 1978; Evans et al, 2000). Given the generating function (f){z), the 
probability function / (k) can be obtained as 



dz^ 



z=0 

Following the graph of inter-earthquake connections as shown in Fig. [TK, we investigate 
the earthquake numbers distributions for space-time intervals larger that the average dimen- 
sions of earthquake clusters. Thus, we neglect space-time differences between cluster mem- 
bers. We assume that the independent (mainshocks) and dependent (foreshocks-aftershocks) 



events are occurring along the S 
constant rates 



= logM axis (M is the scalar seismic moment) with a 



p {dE) — u • dE — const ■ dE , 
P (dE) — P ■ dE — const ■ dE, 

for5<S^, (2) 

where Sm is the logarithm of the maximum seismic moment, M^; i/ and (3 are rates of 
independent and dependent events, respectively. 

Events occurrence can be modeled as an 'immigration and birth' process, where indepen- 
dent, spontaneous earthquakes (mainshocks) are treated as 'immigrants'. Any immigrant 
may spawn offspring, who may spawn their own offspring, etc., with the whole family making 
up a cluster. Seismologists generally call the largest family member the 'mainshock', and the 
preceding smaller events the 'foreshocks', and subsequent members the 'aftershocks'. In this 
case the conditional generating function for the number of events with H > He in a cluster, 
including the mainshock, is (Bartlett, 1978, Eq. 3.4(7), p. 76) 

where (3 is the index of the seismic moment distribution, /3 = |6; h is the parameter of 
the G-R law (Kagan and Jackson, 2000; Bird and Kagan, 2004), and Mc is the moment 
detection threshold of a seismographic network (M > Mc). In this formula we modify 
Bartlett's equation for the 'birth and immigration' (but no 'deaths') population process. 
In future calculations here we take 

/? = 0.62 : (4) 
the value suggested by results in the Bird and Kagan (2004) and Kagan et al. (2009). 



Equation ([3]) characterizes the geometric distribution, with the probabihty function 



f{k) = {l-p)'-'p for fc = l,2,3,. 



A more common form of the geometric distribution (Evans et ai, 2000) is 



f(k) = (l-p)V for A; = 0,1, 2, 



(5) 



(6) 



It would be appropriate for the number of dependent shocks in a cluster. Its conditional 
generating function is 



1- z[l- {MjMf] ■ 
The geometric distribution is a discrete analog of the exponential distribution for continuous 
variables. For a small p-value 

f {k) pexp—kp. (8) 

For the distribution of the total number of events in an earthquake catalog we obtain 
(Bartlett, 1978, Ch. 3.41, p. 83) 



exp /_ [(f) {z \E) — 1] u dE 

log (l - ^[1 - (M,/M^)^]) _ 

-H^rn-Ec) ^ 
u/f3 



exp 



{MJM„ 



1 _ z[l - [MJM^Y] 
The above generating function is for the NBD. 



(9) 



In this derivation we assume that the 'immigration and birth' process starts at the max- 
imum moment: we assume the moment distribution to be a truncated Pareto distribution 
with a 'hard' limit on the maximum moment (Kagan and Jackson, 2000; Zaliapin at ai, 
2005; Kagan, 2006). However, a more appropriate seismic moment distribution (a tapered 
Pareto distribution) uses a 'soft' limit or a corner moment {ibid., Bird and Kagan, 2004). 



To consider this, Kagan and Jackson (2000, Eq. 29) propose some modification of the NBD 
formulas. 

The last line in (Q means that the distribution of events (counting mainshocks as well) in 
earthquake clusters would be logarithmic (or logarithmic series) with the probability function 



fik) 



1 - (MJM^)^ 



{MJM„ 



for A; = 1,2, 3, 



A; log(Me/M„)/3 k(3{E^-E,) 
The standard form of the logarithmic distribution (Evans et al, 2000) is 



(10) 



fik) 



for /c = 1, 2, 3, 



fill 



log(l — p) k 

For a small moment threshold, the logarithmic distribution can be approximated by a tapered 
Pareto distribution (Zaliapin et ai, 2005) 



f{k) ^ {k\og[{MjM,f]exp[k{MjMrnf]} ' for fc = l,2,3 



:i2) 



i.e., the clusters with small and intermediate mainshocks follow the Pareto style heavy-tail 
distribution. The exponential tail should be observed only for clusters with large mainshocks 
having a magnitude close to maximum. 

The average size for a group of dependent events is 



S{k) 



/3(S, 



(13) 



m ' — 'cl 



It is clear from Eqs. [TU] and [T3] that the number of dependent events decreases as Mm'- 
the largest earthquakes are distributed according to the Poisson law. 

The Poisson distribution has the probability function of observing k events as 

A''exp(-A) 



k\ 



(14) 



For this distribution its mean and variance are equal to its rate A. 
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What are the distributions of earthquake numbers for the branching-in-time model (see 
Fig. Can we calculate these distributions similarly to model (a)? Indeed, when the 

first event in a sequence of earthquakes is the largest one, as is often the case, there is 
little ambiguity in calling it a mainshock, and the other dependent events may be called 
aftershocks, though some of them are actually aftershocks of aftershocks. In such a case 
there is little difference between (a) and (b) models. If, however, the first event in a sequence 
is smaller than the subsequent events, it is typically called a foreshock. Retrospectively, it 
is relatively easy to subdivide an earthquake catalogue into fore-, main-, and aftershocks. 
However, in real-time forecasting it is uncertain whether the most recent event registered by 
a seismographic network is a foreshock or a mainshock. Although the subsequent triggered 
events are likely to be smaller and would thus be called aftershocks, there is a significant 
chance that some succeeding earthquakes may be bigger than the predecessors that triggered 
them (Kagan, 1991). 

The difficulty in model (b) is that some connections between events are not observable. 
Suppose that there is a potential foreshock-mainshock pair: a larger earthquake is preceded 
by a smaller one. However, the first event is below the magnitude threshold and the larger 
earthquake is above the threshold. Then this second event would be treated as independent 
(immigrant); that is, our calculations would miss this connection (Kagan, 1991; Sornette 
and Werner, 2005; Kagan, 2006). It is possible that the number of such missed connections 
is small, and the distributions are similar to those derived above. Nevertheless, such dis- 
tribution estimates for this model would be approximate. For the branching-in-magnitude 
model (Fig. [1^), the derived expressions are exact. 

The equations obtained in this subsection are based on the theory of population pro- 
cesses. In simple models of such processes (Bartlett, 1978), we assume that individuals start 
to reproduce immediately after their birth. This is not the case for an earthquake occurrence: 
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after a large event, the aftershocks even large ones cannot be observed for a period of time 
(Kagan, 2004). Moreover, earthquakes are not instantaneous, point events; and their rup- 
ture takes time. Hence, the derived formulas would over-estimate the number of dependent 
events. However, as we will see below, during analysis of the earthquake catalogs the the- 
oretical models provide some valuable insight into the quantitative behavior of earthquake 
occurrence. 

2.2 NBD distribution expressions 

There are many different (often confusing) representations of the NBD (see several exam- 
ples in Anscombe, 1950; Shenton and Myers, 1963). The most frequently used (we call 
it standard) form of the probability density function for the NBD generalizes the Pascal 
distribution (Feller, 1968, Eq. VI.8.1; Hilbe, 2007, Eq. 5.19): 

f(k) = ^(r + l)...(r + t-2)(r + ;c-l) ^ + 

k\ y T — 1 J 

where k = 0,1,2,..., T is the gamma function, < ^ < 1, and r > 0. If the parameter 
T is integer, then this formula (Pascal distribution) is the probability distribution of a cer- 
tain number of failures and successes in a series of independent and identically distributed 
Bernoulli trials. For k + t Bernoulli trials with success probability 6, the negative bino- 
mial gives the probability for k failures and r successes, with success on the last trial. If 
r = 1, this equation corresponds to the geometric distribution. Therefore, the latter 
distribution can be considered a special case of the NBD. 
The average of k for the NBD is 

E{k) = mi = ri^, (16) 
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and its variance 



m = m, = T^. (17) 



The negative binomial distribution generally has a larger standard deviation than the Poisson 
law. Thus, it is often called an 'over-dispersed Poisson distribution' (Hilbe, 2007). For 6^1 
and r(l — 6) ^ X expression ( fTSl) tends to ( fT4ll (Feller, 1968, p. 281); the negative binomial 
distribution becomes the Poisson distribution; the latter distribution is a special case of the 
former. 

Anraku and Yanagimoto (1990, Eq. 1.1) and Hilbe (2007, Eqs. 5.9 or 5.29) propose the fol- 
lowing distribution density form, which they obtain as a mixture of the Poisson distributions 
with the gamma distributed rate parameter A (see, for example, Hilbe, 2007, Eqs. 5.1-5.9 or 
7.21-7.33) 

^ T{l/a + k) X' ^ X^ Tjl/a + k) 1 

' r(l/a)A;! (A + l/«)'= (1 + aA)V" k\ r(l/«) (A + l/a)'= (1 + aA)i/" ' ^ ' 

which converges to the geometric distribution if a = 1 , but to the Poisson distribution (fill) 
when a — >■ , 

f{k) = ^-l-^TTT- (19) 

^ ^ k\ exp(A) ^ ^ 

Comparison with 0151) shows that a = 1/t and X = t {1 — 9)/9 . Distribution (1181) is called 

an alternative form of the NBD in Wikipedia and we accept this term for our purposes. 

In addition to the above representations of the NBD, we use Evans' (1953) expres- 
sions. Evans provides the parameter uncertainty equations for the estimates by the moment 
method. The formula for the probability density is 

_ nX/a + k) [a/{l + a)r .^Ol 
^^"^ - TiX/a)k\ (l + a)Va ' 



If we make a = a X, this equation converts to flTSl) . 
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The probability generating function (Bartlett, 1978) for the NBD standard form is 



i-z{i-e 

For the alternative form (fT8|) of the NBD the generating function is 



(21) 



Comparing fl21l) with ([H]), it is clear that if the moment threshold is close to M^, the NBD 
approaches the Poisson distribution for earthquake counts. This is confirmed by extensive 
seismological practice (see also Figs. [T3] and below) . 

After comparing ( |2Ti) with iQ, we propose the following relations between the parameters 

and 

T = u/p. (24) 

However, as mentioned at the end of Subsection 12. H such relations are valid only when larger 
shocks start producing dependent events immediately after their rupture. In earthquake 
catalogs there is a significant delay of aftershock generation (Kagan, 2004). Thus, these 
expressions are over-estimated. Moreover, in all available earthquake catalogs there are very 
few or no mainshocks with a size approaching the maximum magnitude. Therefore, we would 
not expect the observational 6'-estimates to be close to that of fl23l) . However, as we will see 
the dependence of 6 on the magnitude threshold, the maximum magnitude, and (3 is visible 
in earthquake records. 
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2.3 Statistical parameter estimation 

For the Poisson distribution f[T^ the estimate of its parameter is the average earthquake 
rate per time interval AT 

A = (25) 

where T is the time-span and n is total number of events in a catalog. The estimate of p for 
the geometric distribution's (JSj) is 

P = (26) 
1 + mi 

where rui = A is the average (the first moment). 

For the logarithmic distribution fill I) there is no simple expression to evaluate its pa- 
rameter p. Patil (1962, Table 1) and Patil and Wani (1965, Table 2) propose tables for 
calculating the maximum likelihood estimate (MLE) of the parameter after determining the 
average number of events. 

For the standard form (fT5|) of the NBD, we use (fT6ll and (fT7|) to obtain the estimates of 
the NBD parameters by the moment method 

S = (27) 
m2 



and 



2 

mf 



f = (28) 

1712 — f^l 

where mi and m2 are the average and variance of the empirical number distribution. Below 
we sometimes would use the term moment estimate for the estimate by the moment method. 
For the Poisson process, m2 = mi. Hence the estimate (l28l) would be unstable if the NBD 
process approaches the Poisson one, and the estimate uncertainty would be high. 

For the alternative form of the NBD fllSp . we obtain the following moment estimates 

A = mi, (29) 
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and 



a = ' . 30 
nil 

Evans' (1953) parameter a is estimated similarly to flSU]) 

ma -mi mg . 
a = = 1 . (31) 

mi mi 

Evans (1953, p. 203, see 'Estimation by Method V) derives approximate estimates of the 
parameters' uncertainties 

ax = ^X{a + 1)/N. (32) 

and 



aa ^ ^2{a + l)/N + a{a + l){3a + 2)/{XN). (33) 
as well as the covariance between these two estimates 

Cov(A, a) ^ a{a + l)/N. (34) 

In these equations is the number of time intervals with earthquake counts. 

The maximum likelihood estimate (MLE) of any parameters for the discussed distribu- 
tions can be obtained by maximizing the log-likelihood function 

oo oo 

£ = log n Ifikj) f'^'^^ = E P(f'j) log / (k,) , (35) 

j=0 j=0 

where P{kj) is the observational frequency of earthquake numbers in interval j. Function 
f{k) is defined by the expression f|T^ . f|T5l) . f|T8|) . or fl20|) for the Poisson, the standard NBD, 
the alternative NBD, and Evans' formula, respectively. To evaluate parameter uncertain- 
ties we need to obtain the Hessian matrix (the second partial derivatives of the likelihood 
function) of the parameter estimates (Wilks, 1962; Hilbe, 2007). 
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3 Earthquake catalogs 

We studied earthquake distributions and clustering for the global CMT catalog of moment 
tensor inversions compiled by the CMT group (Ekstrom et al, 2005). The present catalog 
contains more than 28,000 earthquake entries for the period 1977/1/1 to 2007/12/31. Earth- 
quake size is characterized by a scalar seismic moment M. The moment magnitude can be 
calculated from the seismic moment (measured in Nm) value as 

mw = (2/3) -logioM-e. (36) 

The magnitude threshold for the catalog is m5.8 (Kagan, 2003). 

The PDE worldwide catalog is pubhshed by the USGS (U.S. Geological Survey 2008); 
in its final form, the catalog available at the time this article was written ended on January 
1, 2008. The catalog measures earthquake size, using several magnitude scales, and provides 
the body-wave {rrib) and surface-wave (Ms) magnitudes for most moderate and large events 
since 1965 and 1968, respectively. The moment magnitude (mw) estimate has been added 
recently. 

Determining one measure of earthquake size for the PDE catalog entails a certain dif- 
ficulty. For example, Kagan (1991) calculates a weighted average of several magnitudes to 
use in the hkelihood search. Kagan (2003) also analyzes systematic and random errors for 
various magnitudes in the PDE catalog. At various times different magnitudes have been 
listed in the PDE catalog, and establishing their relationships is challenging. Therefore, 
we chose a palliative solution: for each earthquake we use the maximum magnitude among 
those magnitudes shown. This solution is easier to carry out and the results are easily re- 
producible. For moderate earthquakes usually or Ms magnitude is selected. For larger 
recent earthquakes the maximum magnitude is most likely mw Depending on the time 
period and the region, the magnitude threshold of the PDE catalog is of the order 4.5 to 4.7 
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(Kagan, 2003). 

The CalTech (CIT) dataset (Hileman et al, 1973; Hutton and Jones, 1993; Hutton et 
ai, 2006) was the first instrumental local catalog to include small earthquakes (m > 3), 
beginning with 1932/01/01. The available catalog ends at 2001/12/31. The magnitude 
threshold of the 1932-2001 catalog is close to m3 [however, see Kagan (2004) for the threshold 
variations after strong earthquakes]. In recent years, even smaller earthquakes have been 
included in the catalog, so for the 1989-2001 time period, a threshold of m2 is assumed. We 
selected earthquakes in a spherical rectangular window (latitudes > 32.5°N and < 36.5°N, 
longitudes > 114.0°W and < 122.0°W) to analyze. 

4 Earthquake numbers distribution 

4.1 Statistical analysis of earthquake catalogs 

Several problems arise when the theoretical considerations of Section [2] are applied to earth- 
quake catalogs. Due to the limited sensitivity of a seismographic network and its sparse 
spatial coverage, catalogs are incomplete in magnitude, time, and space (Kagan, 2003). In 
particular, the magnitude threshold of completeness varies in time, usually decreasing during 
the catalog time-span. Moreover, after strong earthquakes due to mainshock coda waves and 
interference by other stronger events, small aftershocks are usually absent for a period of 
a few hours to a few days (Kagan, 2004). In the best local catalogs this 'dead' period can 
be significantly reduced (Enescu et al, 2009), suggesting that the lack of aftershocks in the 
interval larger than a few minutes is an artifact of catalog incompleteness in the early stages 
of an aftershock sequence. 

An additional problem in comparing theoretical calculations to observations is identifying 
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earthquake clusters. Due to delays in dependent events temporal distribution described by 
Omori's law, clusters usually overlap in time and space. Only in zones of low tectonic 
deformation can aftershock clusters of large earthquakes be distinguished. Sometimes this 
aftershock sequence decay takes centuries (Ebel et al, 2000). In more active regions, many 
earthquake clusters overlap. 

It is possible, in principle, to use stochastic declustering in defining statistical interrela- 
tions between various events. For the branching-in-magnitude (Fig. [1^) such a procedure 
was first applied by Kagan and Knopoff (1976, see Table XVIII); for the branching-in-time 
model, Kagan and Jackson (1991) and Zhuang et al. (2004) proposed the identification pro- 
cedure. However, such stochastic decomposition is typically non-unique. It depends on the 
details of a stochastic model parametrization and has not been attempted in this work. For 
branching-in-time models, connections of small vs large events (such as foreshock-mainshock 
ties) create an additional difficulty in stochastic reconstruction. See more discussion on this 
topic at the end of Subsection 12.11 Handling these connections unambiguously is difficult. 

Moreover, the equations derived in Subsection 12.11 neglect temporal and spatial parame- 
ters of earthquake occurrence. Hence, they are valid for large space-time windows exceeding 
the size of the largest earthquake cluster. The time-span of available earthquake catalogs is 
very limited. For the largest earthquakes, the duration of the aftershock sequences is com- 
parable or exceeds a typical catalog length. Additionally, when studying earthquake number 
distribution, we need to subdivide the catalog into several sub-units, thus reducing again the 
temporal or spatial window. Furthermore, the theoretical model neglects long-term modu- 
lations of seismicity (Kagan and Jackson, 1991; Lombardi and Marzocchi, 2007). Therefore, 
we should not expect close agreement between the theoretical formula and empirical results. 
Only general regularities in distribution behavior can be seen. 
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4.2 Observed earthquake numbers distributions 

Fig. [2] shows the distribution of shallow (depth 0-70 km) aftershock numbers in the PDE 
catalog, following a m7.1 — 7.2 event in the CMT catalog. We count the aftershock number 
during the first day within a circle of radius R (Kagan, 2002) 

R{m) = 20-10"^^ km. (37) 

Even taking into account location errors in both catalogs, the radius of 200 km for the m7 
event guarantees that, almost all the first day aftershocks will be counted. The geometric dis- 
tribution curve seems to approximate the histogram satisfactorily. Comparing the observed 
cumulative distribution with its approximation in Fig. [3] also confirms that the geometric 
law appropriately describes the aftershock number distribution. 

For the geometric distribution. Fig. H] shows the dependence in the p-parameter on the 
mainshock magnitude. The p- values decay approximately by a factor of 10 "^'^^ with a 
magnitude increase by one unit. This behavior is predicted by Eqs. [Hand [71 

Fig. [5] displays an example of earthquake numbers in equal time intervals (annual in this 
case) for the CIT catalog (see also Fig. 5 by Kagan and Jackson, 2000). Even a casual 
inspection suggests that the numbers are over-dispersed compared to the Poisson process: 
the standard deviation is larger than the average. The number peaks are easily associated 
with large earthquakes in southern California: the m7.5 1952 Kern County event, the m7.3 
1992 Landers, and the m7.1 1999 Hector Mine earthquakes. Other peaks can usually be 
traced back to strong mainshocks with extensive aftershock sequences. 

In large time intervals, one would expect a mix of several clusters, and according to 
Eq. [9] the numbers would be distributed as the NBD. At the lower tail of the number 
distribution the small time intervals may still have several clusters due to weaker mainshocks. 
However, one small time interval would likely contain only one large cluster. Therefore, their 
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distribution would be approximated by the logarithmic law fllOtllip . Fig. [6] confirms these 
considerations. The upper tail of the distribution resembles a power-law with the exponent 
close to 1.0. 

The observed distribution in Fig. [6] is compared to several theoretical curves (c/. Kagan, 
1996). The Poisson cumulative distribution is calculated using the following formula 

oo 

F{k) = P{N<k) = ^/ y'^e^^dy = l-r(A; + l, A), (38) 

A 

where r(A; + 1, A) is an incomplete gamma function. For the NBD 

1 'f 

F{k) = PiN<k) = J y-''il-y)''dy, (39) 

where B{t, k + 1) is a beta function. The right-hand part of the equation corresponds to an 
incomplete beta function, B{t, k + 1, x) (Gradshteyn and Ryzhik, 1980). 

For the logarithmic distribution (fTOj) two parameter evaluations are made: one based 
on the naive average number counts and the other on number counts for a 'zero-truncated' 
empirical distribution (Patil, 1962, p. 70; Patil and Wani, 1965, p. 386). The truncation is 
made because the logarithmic distribution is not defined for a zero number of events. Thus, 
we calculate the average event count for only 60% of intervals having a non-zero number of 
earthquakes. 

These theoretical approximations produce an inadequate fit to the observation. The 
Poisson law fails because there are strong clusters in the catalog. The NBD fails for two 
reasons: the clusters are truncated in time and the cluster mix is insufficient, especially at the 
higher end of the distribution. Moreover, as we mentioned above, the long-term seismicity 
variations may explain the poor fit. The logarithmic distribution fails at the lower end, since 
several clusters, not a single one, as expected by the distribution, are frequently present in 
an interval. The quasi-exponential tail (see Eq. [T2|) is not observed in the plot, since in 
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the CIT catalog there are no events with a magnitude approaching the maximum (corner) 
magnitude. For Cahfornia the corner magnitude should be on the order of mS (Bird and 
Kagan, 2004). 

We produced similar plots for other global catalogs (the PDE and CMT); and the dia- 
grams also display a power-law tail for small time intervals. However, this part of a distri- 
bution is usually smaller. This finding is likely connected to the smaller magnitude range of 
these catalogs; fewer large clusters of dependent events are present in these datasets. 

Fig. [7| shows a cumulative distribution for annual earthquake numbers. Again the fit by 
the Poisson law 0381) is poor, whereas the NBD fl39p is clearly the better approximation. 

4.3 Likelihood analysis 

Fig. [8] displays a two-dimensional plot of the log-likelihood function fl35l) for the standard 
version of the NBD. Such plots work well for parameter estimates: if the relation is non- 
linear or the parameter value needs to be restricted (if, for example, it goes into the negative 
domain or to infinity, etc.) such plots provide more accurate information than does the 
second-derivative matrix. 

The diagram shows that the parameter estimates are highly correlated. Moreover, iso- 
lines are not elliptical, as required by the usual asymptotic assumption; thus, uncertainty 
estimates based on the Hessian matrix (see Eq. 1351) may be misleading. The 95% confidence 
limits obtained by the matlab (Statistics Toolbox) procedure testify to this. Wilks (1962, 
Chap. 13.8) shows that the log-likelihood difference is asymptotically distributed as |x^(2) 
(chi-square distribution with two degrees of freedom, corresponding to two parameters of the 
NBD model). Thus, the isoline [—3.0] at the log-likelihood map should approximately equal 
95% confidence limits. The moment estimates (Eqs. [271 and [281) are within the 95% limits. 

For the PDE catalog, where p = 0.993, the effect of the high correlation coefficient (p) 
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on parameter estimates of the standard NBD is demonstrated more strongly in Fig. [91 In 
addition to the parameters for the empirical frequency numbers, estimated by the MLE 
and the moment method, parameter estimates for 100 simulations produced, applying the 
MATLAB package, are also shown here. These simulations used the MLEs as their input. The 
parameters scatter widely over the plots, showing that regular uncertainty estimates cannot 
fully describe their statistical errors. 

Similar simulations with other catalogs and other magnitude thresholds show that for the 
standard NBD representation, simulation estimates are widely distributed over r, 6 plane. 
Depending on the original estimation method, the moment or the MLE, the estimates are 
concentrated around parameter values used as the simulation input. 

Fig. [To] displays the likelihood map for the alternative form ( fTSl) of the NBD. It shows a 
different behavior from that of Fig. [HI Isolines are not inclined in this case, indicating that 
the correlation between the parameter estimates should be slight. The simulation results 
shown in Fig. [11] confirm this. 

Fig. [12] shows the likelihood map for Evans' distribution. Again the isolines are less 
inclined with regard to axes, showing a relatively low correlation between the estimates. 
Using formulas f[32] - [34l) . supplied by Evans (1953, p. 203) we calculate 95% uncertainties 
shown in the plot. The correlation coefficient (p) between the estimates is ~ 0.15. 

Fig. [13] shows the dependence of the log-likelihood difference i — £o on the magnitude 
threshold (£o is the log-likelihood for the Poisson distribution, see Eq. [35]) . The difference 
increases as the threshold decreases, testifying again that large earthquakes are more Poisson. 
The Poisson distribution is a special case of the NBD. Therefore, we can estimate at what log- 
likelihood difference level we should reject the former hypothesis as a model of earthquake 
occurrence. Wilks (1962; see also Hilbe, 2007) shows that the log-likelihood difference is 
distributed for a large number of events as |x^(l) (chi-square distribution with one degree 
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of freedom. This corresponds to one parameter difference between the Poisson and NBD 
models). The 95% confidence level corresponds to the i — io value of 1.92. Extrapolating 
the curve suggests that earthquakes smaller than m6.5 in southern California cannot be 
approximated by a Poisson model. If larger earthquakes were present in a catalog, these 
events (m > 6.5) might be as clustered, so this threshold would need to be set even higher. 

4.4 Tables of parameters 

Tables [TH3] show brief results of statistical analysis for three earthquake catalogs. These tables 
display three dependencies of NBD parameter estimates: (a) on the magnitude threshold 
{rric)] (b) on time intervals (AT) a catalog time-span is subdivided; and (c) on a subdivision 
of a catalog space window. Three NBD representations are investigated: the standard, the 
alternative, and Evans' formula. Since the parametrization of the last two representations is 
similar, we discuss below only the standard and the alternative set of parameters. We used 
the moment method, which is more convenient in applications, to determine the parameter 
values. 

The parameter variations in all subdivisions exhibit similar features: 
• (a) In the PDE catalog the parameter a decreases as the threshold, rric, increases (a 
corresponds to the Poisson distribution). The ^- value displays the opposite behavior: when 
6 ^ 1.0, the NBD approaches the Poisson law. The 6 parameter shows a similar behavior in 
the CMT and CIT catalogs (the negative parameter values for rric = 7.0 in the CMT catalog 
reveal that the NBD lacks appropriate fit for our observations). The a parameter displays 
no characteristic feature for the CMT and CIT catalogs. The rric = 2.0 results for the CIT 
catalog are obtained for the period 1989-2001, so they cannot be readily compared to results 
for other magnitude thresholds. 

Fig. [m displays for all catalogs the dependence of 6 on the magnitude. The decay of 
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the parameter values can be approximated by a power-law function; such behavior can be 
explained by comparing Eqs. ([9]) and fl2T]) . This diagram is similar to Fig. 6 in Kagan and 
Jackson (2000), where the parameter 9 is called T, and the magnitude/moment transition 
is as shown in (!36|) . 

• (b) The ^-values are relatively stable for both global and the CIT catalogs, showing that 
the event number distribution does not change drastically as the time interval varies. It is 
not clear why the a significantly changes as the time intervals decrease. 

When the AT changes, the behaviour of the a and the 9, is generally contradictory: 
both parameters increase with the decrease of the time interval. This trend is contrary to 
the dependence of these variables on the magnitude threshold [see item (a) above]. The 9 
increase may suggest that the distribution approaches the Poisson law as AT 0, whereas 
the a trend implies an increase in clustering. Such an anomalous behaviour is likely to be 
caused by a change of the earthquake number distribution for small time intervals. Fig. [6] 
demonstrates that the upper tail of the distribution is controlled by the logarithmic series 
law fllOl) . Although the logarithmic distribution is a special case of the NBD, it requires 
one parameter versus NBD's two degrees of freedom. Therefore, it is possible that when the 
logarithmic distribution is approximated by the NBD, the parameters a and 9 of two different 
NBD representations behave differently. The dependence of the distribution parameters on 
the time sampling interval needs to be studied from both theoretical and observational points 
of view. 

Fig. [IS displays the 9 parameter behavior for changing time intervals. There is no regular 
pattern in the curves properties for the three catalogs. However, the ^-values for the 1-year 
and 5-year intervals can be used in testing earthquake forecasts for these catalogs. 

• (c) Two types of spatial subdivision are shown in the tables. For global catalogs we 
subdivide seismicity into five zones according to the tectonic deformation which prevails 
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in each zone (Kagan et al, 2009). The CIT catalog was subdivided into four geographic 
zones. We also made a similar subdivision for the global catalogs (not shown). In all 
these subdivisions both a and 6 are relatively stable, whereas a sum of parameter r-values 
for the subdivided areas approximately equals the parameter's value for the whole area. 
For example, the r-values for the CIT catalog are: 1.0888 ^ 0.5566 + 0.235 + 0.265 + 
0.5672 (1.6238). By definition, the A parameter equals the sum for the sub-areas. 

5 Discussion 

We studied theoretical distributions of earthquake counts. The obtained discrete distri- 
butions (Poisson, geometric, logarithmic, and NBD) are applied to approximate the event 
numbers in earthquake catalogs. The Poisson distribution is appropriate for earthquake 
numbers when the catalog magnitude range (the difference between the maximum and the 
threshold magnitudes) is small. The geometric law applies for the earthquake number dis- 
tribution in clusters (sequences) with a fixed magnitude range, whereas the logarithmic law 
is valid for all earthquake clusters. Finally, the NBD approximates earthquake numbers for 
extensive time-space windows if the magnitude range is relatively large. 

Our results can be used in testing earthquake forecasts to infer the expected number of 
events and their confidence limits. The NBD parameter values shown in Tables [lH3] could 
be so used. The 95% uncertainties can be calculated by using Eqs. f l55]) and fl5^ . 

We have shown that the different implementations of the NBD have dissimilar character- 
istics in the likelihood space of the parameter estimates (Figs. [8l fT2|) . The alternative (ITSl) or 
the Evans (1201) distributions clearly have parameter estimates that look as non-correlated, 
and on that basis they are preferable in the practical use, though additional investigations 
of their performance in real earthquake catalogs should be performed. 
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The moment method based estimates discussed in the paper fall into a high confidence 
level, thus these estimates are relatively effective. They are much simpler to implement and 
faster in terms of computational speed. However, their uncertainties need to be determined. 
Apparently such an evaluation can be done without much difficulty. 

As we mentioned in the Introduction, many studies have been published on certain as- 
pects of earthquake number distributions. But this work is first in combining the theoretical 
derivation with statistical analysis of major features of earthquake occurrence. As a pioneer 
paper, it cannot resolve several issues related to earthquake number distributions. We list 
some of them below, hoping that other researchers, both geophysicists and statisticians, will 
find their solutions. 

• 1) Many dependent events are not included in earthquake catalogs: a close foreshock 
may be misidentified as an initial stage of larger shock, rather than an individual event. The 
coda waves of a large event and strong aftershocks hinder identification of weaker earth- 
quakes. It is likely that these phenomena can be modeled by a more sophisticated scheme of 
population processes (such as age-dependent branching processes, etc.) and a more detailed 
quantitative derivation can be obtained. 

• 2) It would be interesting to obtain some uncertainty estimates for the moment-based 
parameter evaluations. Moment method estimates are easier to obtain; if the variances and 
covariances for parameter estimates can be calculated (as by Evans, 1953), this would signif- 
icantly increase their value. Though many uncertainty estimates were considered previously 
(Anscombe, 1950; Shenton and Myers, 1963; Johnson et al., 2005), they are difficult to 
implement in practical situations. 

• 3) We need to investigate the goodness-of-fit of the distribution of earthquake numbers 
in various time intervals to theoretical distributions, like the Poisson, logarithmic, and NBD. 

• 4) Discrete distributions more general than those investigated in this work need study. 
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It may be possible to establish for such distributions how their parameters depend on earth- 
quake catalog properties. Then the generalized distributions can offer new insight into 
earthquake number distribution. 

• 5) The lack of appropriate software hindered our analysis. Only moment method 
estimates could be used for the catalogs in all their subdivisions. Hilbe (2007) describes 
many commercial and free software packages (like SAS, S-plus, and R) that can be used 
for statistical studies. Their apphcation would facihtate a more detailed investigation of 
earthquake occurrence. However, as few geophysicists are familiar with these statistical 
codes, this may present a serious problem in applying this software. 
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Table 1: Values of NBD parameters for various subdivisions of the 1969-2007 PDE catalog 
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In column 1: G means that the global catalog is used, - plate interior, 1 - Active continent, 
2 - Slow ridge, 3 - Fast ridge, 4 - Trench (sul^iuction zones), see Kagan et al, 2009; n is the 
number of earthquakes, N is the number of time intervals, AT - interval duration in days. 



Table 2: Values of NBD parameters for various subdivisions of the 1977-2007 CMT catalog 
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In column 1: G means that the global catalog is used, - plate interior, 1 - Active continent, 
2 - Slow ridge, 3 - Fast ridge, 4 - Trench (subduction zones), see Kagan et al, 2009; n is the 
number of earthquakes, is the number of l34ie intervals, AT - interval duration in days. 



Table 3: Values of NBD parameters for various subdivisions of the 1932-2001 CIT catalog 



Subd 


rric 


n 


N 


A 




OL 


a 









T 


AT 




1 

± 


9 


Q 
o 


A 


o 




u 


7 




Q 
o 




Q 


in 




G 


6.0 


25 


70 


0.357 


2. 


,1360 


0.7629 


0, 


.56726 


0. 


,4682 


365, 


.2 


G 


5.0 


226 


70 


3.229 


1. 


,4422 


4.6564 


0, 


.17679 


0. 


,6934 


365, 


.2 




4.0 


2274 


70 


O O /I A 

32.49 


1. 


.297b 


42.154 


0, 


A O O 1 'V 

.02317 


0. 


,7706 


o r 

365, 


.2 


G 


3.0 


17393 


TA 

70 


248.5 


0. 


,9184 


OOO OA 

228.20 


o 

0, 


A A AO CI 

.00436 


1. 


AO OO 

,0888 


o^ tr 

365, 


o 

.2 


G 


2.0 


5391 


13 


A -t A l~T 

414.7 


1. 


OA r" o 

,2958 


r" o^ o 

537.36 


0, 


AA-1 O/^ 

.00186 


0. 


,7717 


365, 


.2 


G 


3.0 


17393 


5 


3478.6 


0. 


,1062 


369.46 


0, 


.00270 


9. 


,4155 


5113, 


.6 


G 


3.0 


17393 


10 


1739.3 


0. 


,1569 


272.87 


0, 


.00365 


6. 


,3740 


2556, 


.8 


G 


3.0 


17393 


20 


869.7 


0. 


,3389 


294.73 


0, 


.00338 


2. 


,9507 


1278, 


.4 


G 


3.0 


17393 


50 


347.9 


0. 


,6514 


226.58 


0, 


.00439 


1. 


,5353 


511, 


.4 


G 


3.0 


17393 


70 


248.5 


0. 


,9184 


228.20 


0, 


.00436 


1. 


,0888 


365, 


.2 


G 


3.0 


17393 


100 


173.9 


1. 


,1803 


205.28 


0, 


.00485 


0. 


,8473 


255, 


.7 


G 


3.0 


17393 


OA A 

200 


86.97 


-1 

1. 


,7691 


1 r o or 

153.85 


0, 


AA/^ A /■ 

.00646 


0. 


,5652 


1 0*7 

127, 


.8 


\j 


3.0 


17393 


CT A A 

500 


O y< TA 

34.79 


3. 


AO OO 

,9832 


1 o o cr r* 

138.56 


o 

0, 


.00717 


o 

0. 


O C 1 1 

,2511 


51, 


.1 


G 


3.0 


17393 


1 AAA 

1000 


1 'V OA 

17.39 


0. 


,5226 


1 1 o A r 

113.45 


0, 


A A O A 

.00874 


0. 


"1 r o o 

,1533 


o r 

25, 


.6 


bE 


3.0 


8281 


70 


118.3 


1. 


,7967 


212.55 


0, 


.00468 


0. 


n n 

,5566 


365, 


.2 


NE 


3.0 


2839 


70 


40.56 


4. 


,2550 


172.57 


0, 


.00576 


0. 


,2350 


365, 


.2 


SW 


3.0 


2622 


70 


37.46 


3. 


,7734 


141.34 


0, 


.00703 


0. 


,2650 


365, 


.2 


NW 


3.0 


3651 


70 


52.16 


1. 


,7630 


91.953 


0, 


.01076 


0. 


,5672 


365, 


.2 



In column 1: G means that the whole CIT catalog is used, SE - south-east part of south- 
ern California, NE - north-east, SW - south-west, NW - north-west; n is the number of 
earthquakes, N is the number of time interv^, AT - interval duration in days. 





Figure 1: 



Earthquake branching models: filled circles indicate observed earthquakes; open circles are 
unobserved, modeled events. Vertical thin lines separate unknown past events, the current 
catalog, and future events. The dashed line represents a magnitude threshold; the earthquake 
record above the threshold is considered to be complete. Many small events are not registered 
below this threshold, though some events are observed even below the threshold. Large circles 
denote the initial (or main) event of a cluster .Sorrows indicate the direction of the branching 
process: down magnitude axis in (a) and up time axis in (b). 
(a) Branching-in-moment (magnitude) model, (b) Branching-in-time model. 
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Number of Dependent Events 
Figure 2: 



Red bars - distribution of aftershock numbers m > 4.7 in the PDE catalog 1977-2007, fol- 
lowing 7.2 > mw > 7.1 earthquakes in the CMT catalog. The blue line is an approximation 
of the distribution using the geometric law (Eqs. [5] and [6]) with the parameter p = 0.1279, 
calculated using fl2^ . 
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Number of Dependent Events 
Figure 3: 

Red line - survival function (1 - Cumulative distribution) of aftershock numbers m > 4.7 
in the PDE catalog 1977-2007, following 7.2 > 171]^ > 7.1 earthquakes in the CMT catalog. 
The blue line is an approximation of the distribution using the geometric law (Eqs. [5] and 

ED. 
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Mainshock magnitude 
Figure 4: 

Dependence of the ]3-estimate on mainshock magnitude. The blue hne is for mS.O aftershocks 
during 1-day interval after a mainshock, the red line is for m4.7 aftershocks during 1-day 
interval after a mainshock, and the magenta line is for mA.7 aftershocks during a 7-day 
interval after a mainshock. Solid lines connect data points with more than 3 mainshocks, 
dashed lines are for a single mainshock, and dotted lines connect the estimate for 2004 m9.1 
Sumatra earthquake. The thin black line corresponds to p oc io~^-^™^ (see Eq. H]). 
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Figure 5: 



Annual numbers of earthquakes m > 5.0 in southern Cahfornia, 1932-2001. 
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Fig. 6 




Number of Earthquakes per 25.6 days (1000 intervals) 



Figure 6: 



Survival function (1 - Cumulative distribution) of earthquake numbers for the CIT catalog 
1932-2001, m > 4.0. The step-function shows the observed distribution in 25.6 days time 
intervals (1000 intervals for the whole time period). The green curve is the approximation 
by the Poisson distribution f|T4t [38]) : the magenta solid line is the NBD approximation 
(Eqs. and EH]), with parameters estimated by the MLE method; the magenta dashed 

curve is the NBD approximation with parameters are estimated by the moment method (see 
Subsection 12. Sp : the black dashed line is the approximation by the logarithmic distribution 
(ITU]) : the black solid line is the approximation by the logarithmic law for the zero-truncated 
distribution. 
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Fig. 7 
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Figure 7: 

Cumulative distribution of annual earthquake numbers for the CIT catalog 1932-2001, 
m > 5.0. The step-function shows the observed distribution, and the solid green curve is 
the theoretical Poisson distribution ( !38l) . Two negative binomial curves ( !39|) are also dis- 
played: for the dashed curve the parameters 6 and r are evaluated by the moment method, 
for the solid curve MLEs are used. The negative binomial curves fit the tails much better 
than the Poisson does. 
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Fig. 8 
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Figure 8: 

The log-likelihood function map for the CIT earthquake catalog 1932-2001, m > 5.0, annual 
event numbers are analyzed. The standard representation of the NBD (1151) is used. The 
green square is the estimate of 6 and r by the moment method, whereas blue circle shows 
the MLE parameter values. An approximate 95%-confidence area, based on asymptotic 
relations, corresponds to the contour labeled —3.0. Two orthogonal line intervals centered 
at the circle are 95% confidence limits for both parameters, obtained by matlab (Statistics 
Toolbox). The correlation coefficient p between these estimates (also evaluated by matlab) 
is 0.867. 
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Fig. 9 




Figure 9: 

The standard NBD parameter calculations for PDE earthquake catalog 1969-2007, m > 5.0, 
annual event numbers are analyzed. The large green square is the estimate of 6 and r 
by the moment method, whereas large blue circle shows the MLE parameter values: r = 
33. 98± 15.42 and 6 = 0.0274±0.0122. Two orthogonal line intervals centered at the circle are 
95% confidence limits for both parameters. Small circles are simulated parameter estimates, 
using MLEs. In simulations the parameter estimates for 9 and r are also MLEs (see above). 
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Figure 10: 

The log-likelihood function map for CIT earthquake catalog 1932-2001, m > 5.0, annual 
event numbers are analyzed. The alternative representation of the NBD (|T8l) is used. The 
green square is the estimate of 6 and A by the moment method. An approximate 95%- 
confidence interval, based on asymptotic relations, corresponds to the contour labeled —3.0. 
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Fig. 11 
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Figure 11: 

Simulation results for Caltech earthquake catalog 1932-2001, m > 4.0, annual event numbers 
are analyzed. The alternative representation of the NBD (ITSl) is used. The large green square 
is the estimate of a and A by the moment method for the catalog data. Small squares are 
simulated parameter estimates, using a and A as input. In these displays the parameters a 
and A are moment estimates. 
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Fig. 12 
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Figure 12: 

The likelihood function map for CIT earthquake catalog 1932-2001, m > 5.0, annual event 
numbers are analyzed. Evans' (1953) representation of the NBD (1201) is used. The red 
square is the moment estimate of a and A. An approximate 95%-confidence interval, based 
on asymptotic relations, corresponds to the contour labeled —3.0. Two orthogonal line 
intervals centered at the square are 95% confidence limits for both parameters, based on 
Evans' (1953) variance formula for moment estimates. 
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Figure 13: 

Dependence of the log-likelihood difference for the NBD and Poisson models of earthquake 
occurrence on the threshold magnitude. The CIT catalog 1932-2001 is used, annual event 
numbers are analyzed. The magenta line corresponds to i — io = 1.92; for a higher log- 
likelihood difference level the Poisson hypothesis should be rejected at the 95% confidence 
limit. 
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Fig. 14 




Figure 14: 

Dependence of the ^- value for the NBD model (ITSl) of earthquake occurrence on the threshold 
magnitude. Three catalogs are used: the green curve is for the CIT catalog 1932-2001, the 
red curve is for the PDE catalog 1969-2007, and the blue curve is for the CMT catalog 
1977-2007. The magenta line is 6' = 1.0, corresponding to the Poisson occurrence. Thin 
black line corresponds to ^ oc io^-^"^f^ (^see Eq. H]). 
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Figure 15: 

Dependence of the 6'- value for the NBD model f|T5|) of earthquake occurrence on time interval 
duration (AT). Three catalogs are used: the green curve is for the CIT catalog 1932-2001 
(circles are for m3 earthquakes and crosses for m5); the red curve is for the PDE catalog 
1969-2007, and the blue curve is for the CMT catalog 1977-2007. Two black vertical lines 
show 1-year and 5-year intervals. 
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